SLIP

 

SLIP Technology Browser Exercise III

 

November 19, 2001

 

 

 

 

Obtaining Informational Transparency with Selective Attention

 

 

 

Dr. Paul S. Prueitt

President, OntologyStream Inc


 

 


 

 

 


{s_port, d_port}

 


 

One needs the WinZip file, dSLIP

 

Review:

 

Our analytic conjecture is established by setting the “b” values to the source port and the “a” values to the defensive port. The SLIP Warehouse Browser, now under development, will facilitate the process of developing the analytic conjecture.

 

The conjecture is that the non-specific relationship, r, will gather the defensive port values into categories that reveal something about the global events that were going on during the period in which an IDS was detecting intrusion events. The IDS event log is input into the Warehouse Browser to produce the source files needed by the Technology Browser.

 

 

Figure 1: The analytic conjecture

 

 

Formally we have:

( a1 , b ) + ( a2 , b ) → < a1 , r, a2 >

 

where r is the non-specific relationship. 

 

The “b” values are from one column in the intrusion event log and the “a” values are from a second column in the intrusion event log. 
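The rule above can be sketched in a few lines of Python. This is only an illustration, assuming the event log has already been reduced to (a, b) rows; the function name conjecture_pairs and the sample port values are hypothetical and are not part of the SLIP software.

```python
from collections import defaultdict
from itertools import combinations


def conjecture_pairs(rows):
    """Return the unordered pairs (a1, a2) that share at least one b value.

    rows: iterable of (a, b) tuples, for example (d_port, s_port).
    """
    a_values_by_b = defaultdict(set)
    for a, b in rows:
        a_values_by_b[b].add(a)

    pairs = set()
    for a_values in a_values_by_b.values():
        # Every two distinct a values seen with the same b are related by r.
        for a1, a2 in combinations(sorted(a_values), 2):
            pairs.add((a1, a2))
    return pairs


# Hypothetical (d_port, s_port) rows, for illustration only.
rows = [(515, 9100), (631, 9100), (80, 1248), (515, 1248)]
pairs = conjecture_pairs(rows)
print(pairs)
```

Here d_ports 515 and 631 are paired because both occur with s_port 9100, and 80 and 515 are paired because both occur with s_port 1248.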

 

We call “b” the “first name” and “a” the “second name”. The set { a } defines the set of atoms that are categorized.

 

The set { b } provides the means to define the incident-level events that result from the SLIP emergent computing technique. Exercise II focused on this, but the techniques for automatically organizing event maps are still under development. Exercise III continues this focus.

 

The incident map is often incomplete but can be used to begin the process of developing a model for each of a number of currently occurring global incident events.  Taken together, automated means create small topic maps with specific relationships that the domain experts (working within their security environment) may use to profile the global events of interest.  

 

 

Figure 2: A graphic depicting the construction and display of the event map

 

An event type is something that we can profile because the events in the event record recur with sufficient similarity to be recognized as instances of the event type.

 

SLIP-generated event maps may in fact be major components of complete incident event types, but not the complete event type. The issue is that the emergent computing is being asked to produce a fractionation (parcellation or categorization) that is crisp, while the events themselves have inexact boundaries and overlap each other.

 

There are “event fidelity” issues to consider. The IDS event log will have various degrees of completeness (having all aspects of the intrusions detected) and consistency (having only those local events detected that are related to a single global event).

 

The Warehouse Browser is being designed to take care of some of the fidelity issues. However, it must be understood that a complete system for real-time incident event monitoring can be developed only with all three of the browsers. The Enterprise Browser is needed to archive the work product, to allow the community to discuss the universe of global events that are occurring over any one period of history, and to anticipate new kinds of event types based on vulnerability studies.

 

We need to look at the small clusters at the bottom of a standard construction of a SLIP Framework, where all of the major clusters are removed on the first pass and the re-cluster process then works on the remnants. What we expect from this is a collection of partial events that can be put together by domain specialists using an event map graph.

 

The computational machinery to enable a domain expert to develop event maps from partial event maps is being developed. However, in Exercise III we take a different approach and try to get the complete event.

 

A categorical min-max problem is worked out in general terms, in formal theorems, and will be explored a bit more in this exercise set. The formal theorems may lead to a body of knowledge similar to that of rough set theory or fuzzy set theory. (Currently one PhD dissertation, at Swinburne University of Technology, Australia, is being planned based on the underlying theoretical work.)

 


The Example:

 

Let the first column, (b), be the port used by the source of the attack (s_port) and the second column, (a), be the defensive port (d_port).   There are 764 atoms in the top-level category A1 of the associated SLIP Framework. 

 

The set of paired d_port values has 32,927 pairs, each member of a pair being a port value. The pairs are defined through the analytic conjecture graph, Figure 1.

 

Pre-exercise: Start SLIP.exe in a folder that contains a folder named “data”. You need only two text files to start with.

 

1)       Paired.txt is the file containing the 32,927 pairs of port values.

2)       Datawh.txt (Data Warehouse) is the file containing the 14,475 RealSecure summary event records.
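As a quick check outside the Browser, the pairs can be loaded and the atoms counted with a short script. The layout of Paired.txt is not documented here, so this sketch assumes one pair per line, with the two port values separated by whitespace or a comma; the names read_pairs and atoms_of are hypothetical helpers, not part of SLIP.

```python
def read_pairs(lines):
    """Parse lines holding two port values each into (int, int) tuples.

    Assumed layout (not confirmed by the SLIP documentation): one pair per
    line, values separated by whitespace or a comma. Lines that do not hold
    exactly two fields are skipped.
    """
    pairs = []
    for line in lines:
        fields = line.replace(",", " ").split()
        if len(fields) != 2:
            continue  # blank or malformed line
        pairs.append((int(fields[0]), int(fields[1])))
    return pairs


def atoms_of(pairs):
    """Return the set of distinct port values that occur in any pair."""
    return {value for pair in pairs for value in pair}


# Hypothetical sample lines; with the real file one might instead write
#   with open("data/Paired.txt") as handle: pairs = read_pairs(handle)
sample_lines = ["515 631", "", "80,515"]
sample_pairs = read_pairs(sample_lines)
sample_atoms = atoms_of(sample_pairs)
print(len(sample_pairs), "pairs,", len(sample_atoms), "atoms")
```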

 

However, for this exercise we start with some data that is contained in nested files. After starting SLIP.exe, one will see the structure in Figure 3a.

 

The idea is that some of the atoms are not really part of the major events.  We might remove these atoms first.  But how?

 

 

Figure 3 (panels a and b): A1 and the Fourth Residue

 

Before reading further, perhaps you should speculate on how to remove atoms that are not part of the three major clusters seen in Figure 3a. Figure 3a has 794 atoms. Figure 3b has 454 atoms. The major qualitative difference is that Figure 3a has many atoms outside the three clusters, whereas Figure 3b clusters rather quickly into three clusters with little or no residue. We have removed more than 40% of the atoms without affecting the composition of the main events.
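The size of the reduction between the two panels can be checked directly from the atom counts stated for Figure 3a and Figure 3b. The variable names are only for illustration.

```python
atoms_before = 794  # Figure 3a, as stated in the text
atoms_after = 454   # Figure 3b, as stated in the text

removed = atoms_before - atoms_after
fraction_removed = removed / atoms_before

print(f"{removed} atoms removed ({fraction_removed:.1%} of Figure 3a)")
```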

 

B1 and B2 are taken from the circle segments 0,100 and 170,360. As we move down the layers, we remove (each time) the complement of the main clusters. At the B layer, we had to do this in two steps because the Browser does not yet allow one to specify a segment having 0 in its interior.
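The two-step workaround amounts to splitting a segment that passes through 0 into two ordinary segments. A minimal sketch, assuming each atom has an angle in degrees in the range 0 to 360; the function names, atoms, and angles below are hypothetical and do not describe the Browser's internals.

```python
def wrap_segment(start, end):
    """Split a segment that passes through 0 (start > end) into two plain ones."""
    if start <= end:
        return [(start, end)]
    return [(start, 360), (0, end)]


def select(angles, segments):
    """Return the atoms whose angle lies inside any of the given segments.

    angles: dict mapping atom -> angle in degrees, 0 <= angle < 360.
    segments: list of (start, end) with start <= end.
    """
    return {
        atom
        for atom, angle in angles.items()
        if any(start <= angle <= end for start, end in segments)
    }


# Hypothetical atoms and positions on the circle.
angles = {21: 50.0, 80: 135.0, 515: 200.0, 1863: 359.0}

segments = wrap_segment(170, 100)  # the segment from 170 through 0 to 100
kept = select(angles, segments)
print(segments, sorted(kept))
```

The segment from 170 through 0 to 100 becomes the two segments 170,360 and 0,100, which is the same pair of segments used for B1 and B2.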

 

We must be careful that no atoms are in transit through the removed regions, moving from one large cluster to another. In Figure 3a the clusters are close together, but early in the process one can see that there is motion in the complement region. We have to iterate until this motion stops, which can be seen at about 3,000 (times 1,000) iterations. At this point, we can still see a lot of motion in the clusters and between the clusters.

 

The idea is that by repeatedly taking this “stationary” part away, and reclustering, we will reveal the same three clusters but with less clutter from atoms that are not really involved with the main events.
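The test for “motion has stopped” can be illustrated with a toy model. The sketch below is not the SLIP clustering algorithm, whose update rule is not given in this exercise; it assumes a simple gather rule in which a randomly chosen linked pair of atoms moves a small step toward each other on the circle. All names and values are hypothetical.

```python
import random


def gather(angles, pairs, iterations, step=0.1, rng=None):
    """Toy gather rule: pull randomly chosen linked atoms toward each other.

    angles: dict atom -> angle in degrees (modified in place).
    pairs: list of (a1, a2) links.
    """
    rng = rng or random.Random(0)
    for _ in range(iterations):
        a1, a2 = rng.choice(pairs)
        # Signed shortest angular difference from a1 to a2, in [-180, 180).
        diff = (angles[a2] - angles[a1] + 180) % 360 - 180
        angles[a1] = (angles[a1] + step * diff) % 360
        angles[a2] = (angles[a2] - step * diff) % 360


def circular_distance(x, y):
    """Shortest distance in degrees between two angles on the circle."""
    d = abs(x - y) % 360
    return min(d, 360 - d)


def stationary_atoms(angles, pairs, window, tol=1.0):
    """Run `window` more iterations; return atoms that moved less than `tol`."""
    before = dict(angles)
    gather(angles, pairs, window)
    return {a for a in angles if circular_distance(before[a], angles[a]) < tol}


# Hypothetical data: atoms 1 and 2 are linked, atom 3 has no links.
angles = {1: 10.0, 2: 50.0, 3: 200.0}
pairs = [(1, 2)]

first = stationary_atoms(angles, pairs, window=200)   # atoms 1 and 2 still moving
second = stationary_atoms(angles, pairs, window=200)  # motion has stopped
print(sorted(first), sorted(second))
```

In this toy run only atom 3 is stationary during the first window, while in the second window nothing moves any longer. In the exercise, the atoms to remove are those that are both stationary and outside the circle segments of the main clusters.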

 

Remember that the original event log can be selected to be specific to an event that has been determined by other means.  We simply wish to remove data that is not linked by the link analysis, and reveal an event map of the already identified event.  The proper selection of the source intrusion event log will help in automating the identification of characteristic features of the global incident event that we wish to understand and inventory. 

 

     

Figure 4 (panels a and b): Second and Third Residue

 

One can observe the similarity of the cluster patterns from one level to the next if one removes only the atoms that are stationary and outside the main clusters.

 

 

Figure 5 (panels a and b): One of the prime structures after removing non-relevant atoms

 

The SLIP Browsers are general-purpose tools that allow the domain experts a great deal of flexibility in what they do with the tool. The notion that we want prime structures after the removal of non-relevant atoms just comes from some experience with the SLIP theory. The concept that a global event type will have characteristic features is immediate. Of course, event types, at whatever level, will have characteristic features. What we are looking for are the characteristic features of the global events that were of importance to the IDS system RealSecure on April 15, 2001. An experienced domain expert will likely figure out new ways to use the Technology Browser.

 

The Report for the category F1 is displayed by clicking on the Report button. In this report, all events but one are a Port_Scan to defensive address 204.208.170.77 from source IP 218.250.225.201. There are around 150 of these. The same s_port (9100) is used, and the d_port values range from 500 to 1100. Because s_port 9100 is paired with all of these d_ports in the analytic conjecture, they are all tightly linked.
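One way to see why a group of atoms is tightly linked is to ask which “b” values occur with every atom in the group. The sketch below assumes the event log is available as (d_port, s_port) rows; the function name shared_b_values and the sample rows are hypothetical, chosen only to resemble the situation described in the Report.

```python
from collections import defaultdict


def shared_b_values(rows, atoms):
    """Return the b values that occur with every atom in `atoms`.

    rows: iterable of (a, b) tuples, for example (d_port, s_port).
    """
    b_values_by_a = defaultdict(set)
    for a, b in rows:
        b_values_by_a[a].add(b)
    sets = [b_values_by_a[a] for a in atoms]
    return set.intersection(*sets) if sets else set()


# Hypothetical rows: three d_ports scanned from s_port 9100, one unrelated event.
rows = [(515, 9100), (631, 9100), (721, 9100), (1863, 1248)]

print(shared_b_values(rows, [515, 631, 721]))        # one common s_port
print(shared_b_values(rows, [515, 631, 721, 1863]))  # no common s_port
```

An atom that shares no “b” value with the rest of a group, as in the second call, is a candidate outlier worth checking by randomizing and reclustering.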

 

But what is it about the first event in the Report?  It is a Stream_DoS with d_port 1863 and s_port 1248.  Why is this atom there?  Randomizing and reclustering a few times will show that the atom 1248 is not part of the prime.  In Figure 5b we point to the outlier using the indicator line. 

 

As an exercise we leave it to the user to select R4 and pull out some additional primes. 

 

Here are some suggestions.

 

A.1: Use the command “mag 10” to set the proper magnification. (The default magnification still needs to be changed to 10.)

 

A.2: At this point, one cannot tell from the displayed data which cluster became F1, since the cluster structure has been randomized and clustered. Choose one of the atoms in F1 and find this atom in R4. Now note the angle, in degrees, at which the atom sits in R4. This is the cluster that was used to create F1. Check some of the other atoms. (One sees that the cluster is at 126 degrees.)

 

A.3: Take the circle segment 347,360 and create F2 (use “347,360 -> F2”). This creates a category with 137 atoms.

 

A.4: Type in “gen” to generate the Report. This is still a slow process, but it will take only about 2 minutes. Notice that the Response Messages count the number of atoms processed. 1985 reports are created. (The browser will not display this many records yet, but you can open up the data folder and look at the record file.)

 

A.5: The other cluster is a large one that seems to have two parts, so perhaps create a residue for the F layer and re-cluster. To create a residue at the F layer, click on the R4 node and type Residue R5, to give it the name “R5”. (Naming the residue R at each level, which is the default, still confuses the browser sometimes.)

 

A.6: After getting a residue (mine has 230 elements in it), re-cluster and see if the two structures seen in R4 separate. (Type in random and then cluster 40.)

 

Having completed the above steps, we can now delete the data structure and redevelop the SLIP Framework.

 

B.1: Delete the A1 folder and all of its contents. There are nested folders that are read by the Browser, and all of these will be deleted. You need only have Paired.txt, the file containing the 32,927 pairs of port values, and Datawh.txt (Data Warehouse).

 

B.2: Redevelop several levels where the iterations are carried out long enough to remove stationary regions that are outside of main clusters.

 

B.3: Produce three Primes and generate the Reports. These three primes will be almost (or exactly) the primes that were just deleted.

 

B.4: Inspect the Report files using a word processor.

 

Items under remedial development (bugs)

 

A number of small issues are to be corrected before the next release of the SLIP Technology Browser:

1)       Bracketing across 0.

2)       Allowing the use of the same name (for example R) at more than one level of the Framework.

3)       Showing the Report even if the Report file is long (say, up to 20 pages).

4)       Ordering the report by any column.

5)       Align the report columns…